Saturday, September 20, 2008

web and computers...

One interesting tendency on the web these days is the dramatic advancement of browser tech.... just a mad idea: could this lead to practical self-applicability and/or to developing the whole system in (or rather as) a browser?

And I do not mean things like Dan Ingalls' Lively Kernel, no. The idea is more of a standalone bare browser/server on hardware that is useful both connected and not. this could include a proxy cache as the store; booting from the web or a local server (loading data into the cache), working with the data locally, and posting changes when the system is on-line.... redefining the personal computer concept in terms generally associated with the web.
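
to make the "proxy cache as store" part a bit more concrete, here is a minimal sketch of the idea (all names are made up and the fetch/push callables stand in for real HTTP code, so treat this as a doodle, not a design):

    # a minimal sketch of the "proxy cache as store" idea -- hypothetical names,
    # fetch/push are placeholders for real network code...
    class CachedStore(object):
        def __init__(self, fetch, push):
            self._fetch = fetch        # callable: url -> data (raises IOError when off-line)
            self._push = push          # callable: (url, data) -> None
            self._cache = {}           # the local store (the "proxy cache")
            self._pending = []         # changes made while off-line

        def get(self, url):
            # try the network first, fall back to the cache when off-line...
            try:
                data = self._fetch(url)
                self._cache[url] = data
            except IOError:
                data = self._cache[url]
            return data

        def put(self, url, data):
            # always write locally, queue the change if the server is unreachable...
            self._cache[url] = data
            try:
                self._push(url, data)
            except IOError:
                self._pending.append((url, data))

        def sync(self):
            # called when the system comes back on-line...
            while self._pending:
                url, data = self._pending.pop(0)
                self._push(url, data)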

Some food for thought :)

Sunday, September 07, 2008

OOP and framework cooperative reuse...

I've been hitting this problem for a couple of years now...

here it is, in short: we have an algorithm or a set of algorithms that operate on one or several "terms". this is usually called a framework and is implemented as a (usually, but not necessarily) monolithic entity with several customization points, and it is used by replacing/customizing these points to achieve a variation of the provided functionality. there are numerous examples of this approach, among which are Python's logging package and the Twisted framework, to name a couple of big ones; on the smaller scale there are things like UserDict, etc. this approach works quite well but has two rather big weaknesses:

  1. the bigger things get, the more unnecessary complexity there is in the design and implementation, and in some (if not most) cases this also translates into complexity of use...

  2. a framework is extremely hard to implement in a way that lets it be reused several times within one context...




the second point may need a little explaining. let us consider this example: we have several frameworks that simplify Python's dict creation and use; these usually require the user to define only three or four special methods and implement the rest of the dict interface on top of that... quite straightforward, indeed. so to demo the above concept, let's try and implement an object that acts as a dict and at the same time responds to a special attribute interface, where both protocols access different data stores and are essentially independent, yet almost identical in their base functionality.
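
to make that concrete, here is roughly what such a framework looks like from the user's side -- using the standard library's UserDict.DictMixin (Python 2) as a stand-in rather than pli, just to keep the example self-contained:

    from UserDict import DictMixin   # Python 2 standard library

    class SimpleMap(DictMixin):
        # we only define the four base methods; DictMixin derives the rest of the
        # dict interface (get, items, update, __contains__, __len__, ...) from them.
        def __init__(self):
            self._data = {}
        def __getitem__(self, key):
            return self._data[key]
        def __setitem__(self, key, value):
            self._data[key] = value
        def __delitem__(self, key):
            del self._data[key]
        def keys(self):
            return self._data.keys()

    m = SimpleMap()
    m['a'] = 1
    print m.get('b', 42)     # -> 42, derived from the base methods
    print 'a' in m, len(m)   # -> True 1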

so some of the traditional approaches to solve this are:

  1. implement everything by hand.

    as far as my experience goes, this appears to be one of the most popular ways to solve things. it is quite boring and bad, so I'll skip the details and the code example.



  2. extend and wrap (the OOP way).

    In other words, make the object dict-like by extending a special framework and proxy the attr access to a nested container (or vice versa).

    here is an example:

    import pli.pattern.proxy.utils as putils
    import pli.pattern.mixin.mapping as mapping

    # mapping.Mapping is quite similar to UserDict from the standard library but is quite a bit more
    # flexible. though, for this example it makes almost no difference, except that the base methods
    # needed by each library are a bit different.
    class DictLikeWithAttrs(mapping.Mapping):
        _dict_data = None
        # to add some fun, all of the instances share the same "base" state, yet can still store private
        # data in their local namespace...
        _attr_data = {}

        def __init__(self):
            self._dict_data = {}

        # proxy the dict interface...
        putils.proxymethods((
                '__getitem__',
                '__setitem__',
                '__delitem__',
                '__iter__',
            ), '_dict_data')

        # proxy the attr methods to the dict interface of the _attr_data...
        putils.proxymethods((
                ('__getattr__', '__getitem__'),
                ('__delattr__', '__delitem__'),
            ), '_attr_data')
        # and we need to take special care with __setattr__ to avoid infinite recursion...
        def __setattr__(self, name, val):
            if name in ('_attr_data', '_dict_data'):
                return super(DictLikeWithAttrs, self).__setattr__(name, val)
            self._attr_data[name] = val

        # this will set local attrs that have priority over the global state...
        putils.proxymethod(
            ('setlocal', '__setitem__'), '__dict__')
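
    pli's proxymethods/proxymethod themselves are not shown here, so, purely as an assumption about the mechanism (a guess, not pli's actual code), a helper of this kind could build the proxies and inject them into the calling class body through the frame stack:

    import sys

    def proxymethods_sketch(methods, target, depth=1):
        # inject proxy methods into the caller's namespace (a class body when
        # called at class level)... a guess at the mechanism, not pli itself.
        ns = sys._getframe(depth).f_locals
        for m in methods:
            # each entry is either a name or a (proxy_name, target_name) pair...
            if isinstance(m, tuple):
                proxy_name, target_name = m
            else:
                proxy_name = target_name = m
            def make_proxy(target_name):
                def proxy(self, *args, **kwargs):
                    # look up the container by attribute name, then call the target method on it...
                    return getattr(getattr(self, target), target_name)(*args, **kwargs)
                return proxy
            ns[proxy_name] = make_proxy(target_name)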


    this approach has both its advantages:
    • it involves far less work than the first approach.

    • it is also simpler, if proper tools and discipline are used.


    and disadvantages:
    • the result is not too reusable or configurable.

    • not simple enough (IMHO).

    • scales badly -- we essentially have to proxy all the methods by hand.

    • another reason it scales badly is that you can extend only once per protocol; for each consecutive use you have to nest.


    just imagine how an object that supports three or four different mapping-like interfaces would look... (a rough sketch of the problem follows)
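
    to illustrate that last point (again using DictMixin as a stand-in, with made-up names): only the first mapping protocol can be inherited, every additional one ends up as a nested store with hand-written forwarding.

    from UserDict import DictMixin

    class TwoMappings(DictMixin):
        def __init__(self):
            self._data = {}
            self._meta = {}                  # the second, independent store

        # first protocol: the mixin fills in the rest of the dict interface...
        def __getitem__(self, key):
            return self._data[key]
        def __setitem__(self, key, value):
            self._data[key] = value
        def __delitem__(self, key):
            del self._data[key]
        def keys(self):
            return self._data.keys()

        # second protocol: no mixin left to inherit, so forward by hand...
        def getmeta(self, key):
            return self._meta[key]
        def setmeta(self, key, value):
            self._meta[key] = value
        # ...and so on for every method of every additional interface.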



  3. write an interface factory.

    let's go straight to an example:

    def mappingproxy(target, get='__getitem__', set='__setitem__', delete='__delitem__', iter='__iter__', depth=1):
        map = ()
        # these will enable us to disable some methods...
        if get is not None:
            map += ((get, '__getitem__'),)
        if set is not None:
            map += ((set, '__setitem__'),)
        if delete is not None:
            map += ((delete, '__delitem__'),)
        if iter is not None:
            map += ((iter, '__iter__'),)
        # sanity check...
        if map == ():
            raise TypeError('to generate a proxy we need at least one method enabled.')
        # now generate the proxies...
        putils.proxymethods(map, target,
            # this sets the namespace depth of the caller... (off-topic, but in case you wonder :) )
            depth=depth+1)

    def attrproxy(target, get='__getattr__', set='__setattr__', delete='__delattr__', iter=None, depth=1):
        return mappingproxy(target, get=get, set=set, delete=delete, iter=iter, depth=depth+1)


    so our class will look like this... (the only changes are the mappingproxy and attrproxy calls)

    class DictLikeWithAttrs2(mapping.Mapping):
        _dict_data = None
        # to add some fun, all of the instances share the same "base" state, yet can still store private data...
        _attr_data = {}

        def __init__(self):
            self._dict_data = {}

        mappingproxy('_dict_data')

        attrproxy('_attr_data', set=None)

        # and we need to take special care with __setattr__ to avoid infinite recursion...
        def __setattr__(self, name, val):
            if name in ('_attr_data', '_dict_data'):
                return super(DictLikeWithAttrs2, self).__setattr__(name, val)
            self._attr_data[name] = val

        # this will set local attrs that have priority over the global state...
        putils.proxymethod(
            ('setlocal', '__setitem__'), '__dict__')
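
    and a quick usage sketch of what either version is intended to do (this just restates the semantics described in the comments above, assuming pli's proxies behave accordingly):

    d = DictLikeWithAttrs2()
    d['x'] = 1                     # goes to the per-instance _dict_data
    d.colour = 'red'               # goes to the class-level _attr_data, shared by all instances
    print d['x'], d.colour         # -> 1 red

    e = DictLikeWithAttrs2()
    print e.colour                 # -> red (shared "base" state)
    e.setlocal('colour', 'blue')   # local attr, takes priority over the shared state
    print e.colour, d.colour       # -> blue red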

    this is a good approach: we made the user code both shorter and more understandable (albeit at the expense of adding some more complexity to the library). but, on the downside, our factory and the generated code are rather static.

    in essence, by automating part of the process we made the whole thing less flexible.



  4. design the interface to be "stackable" from the ground up.

    this is really bad, as it makes everything dramatically more complex by adding dispatching to everything, plus confusing option and data registries/dicts/lists...



  5. macros or templates.

    I'd avoid this at this point, as I'm investigating the possibility of a clean OOP approach.



I regularly use the second and third approaches, but I'm looking for something that would feel more natural...

In my view the best way to go would be to define a framework in a way that isolates the algorithms from the terms they use, making the latter overloadable.
On the small scale this is the same as a simple function, where the functionality is black-boxed: the user just passes in the arguments and gets the result; but this does not scale with ease to collections of functions or frameworks. I can't seem to find anything that fits this in any of the systems/languages I know of (but I could be missing a thing or two).
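
to show what I mean by the small-scale case, a toy illustration (nothing more than that): the algorithm is written once against abstract read/write terms, and each protocol just binds those terms differently.

    # a toy illustration: one "algorithm" (copying items between stores), with its
    # terms (how to read and how to write) passed in instead of baked in...
    def copy_items(keys, read, write):
        for k in keys:
            write(k, read(k))

    class Obj(object):
        pass

    source = {'a': 1, 'b': 2}
    target_dict = {}
    target_obj = Obj()

    # the same algorithm reused over the dict protocol and the attribute protocol...
    copy_items(source.keys(), source.__getitem__, target_dict.__setitem__)
    copy_items(source.keys(), source.__getitem__, lambda k, v: setattr(target_obj, k, v))
    print target_dict, target_obj.a, target_obj.b    # -> {'a': 1, 'b': 2} 1 2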


Any Ideas? :)


P.S. you can find pli here...